Structure of a large social network 
We study a social network consisting of over $10^4$ individuals, with a degree distribution exhibiting
two power scaling regimes separated by a critical degree $k_{\rm crit}$, and a power law relation between
degree and local clustering. We introduce a growing random model based on a local interaction
mechanism that reproduces all of the observed scaling features and their exponents. Our results
lend strong support to the idea that several very different networks are simultaneously present in the
human social network, and these need to be taken into account for successful modeling.

PACS numbers: 89.75.Da, 89.75.Hc, 89.75.Fb, 89.65.Ef
The ubiquity of networks has long been appreciated: complex systems in the social and
physical sciences can often be modelled on a graph of nodes connected by edges. Recently
it has also been realized that many networks arising in nature and society, such as neural
networks, food webs, cellular networks, networks of sexual relationships, collaborations
between film actors and scientists, power grids, Internet routers and links between pages
of the World Wide Web, all share certain universal characteristics very poorly modelled by
random graphs [13]: they are highly clustered "small worlds" with small average path length
between nodes, and they have many highly connected nodes with a degree distribution often
following a power law. The network of humans with links given by acquaintance ties is one
of the most intriguing of such networks, but its study has been hindered by the absence of
large reliable data sets.

- The new data set. The WIW project was started by a small group of young professionals in
Budapest, Hungary in April 2002 on the web site www.wiw.hu with the aim of recording social
acquaintance. The network is invitation-only, and new members join by an initial link
connecting to the person who invited them. New links are recorded between members after
mutual agreement. This scheme results in members preferring to use their real names and
effectively prevents proliferation of multiple pseudonyms. Because the network is relatively
young, links formed between people newly acquainted through the web site have a minimal
structural effect; thus the majority of the links represent genuine social acquaintance and
the WIW develops as a growing subgraph of the underlying social acquaintance network.
Indeed, the growth process of the WIW network is essentially equivalent to the "snowball
sampling" method well known to sociologists and to the crawling methods used to investigate
the World Wide Web and other computer networks. We study the WIW network using two
anonymous snapshots taken in October 2002 (with 12388 nodes and 74495 links) and January
2003 (with 17496 nodes and 127190 links).

The degree distribution of the WIW network is plotted in Figure 1. The graph shows two power
law regimes, $P(k) \sim k^{\gamma_1}$ for $k < k_{\rm crit}$ and $P(k) \sim k^{\gamma_2}$ for
$k > k_{\rm crit}$. The two regimes are separated by a critical degree $k_{\rm crit} \approx 25$.
The exponent $\gamma_2 \approx -2$ of the large-$k$ power law falls in a range that has often
been observed before in a variety of contexts. The value $\gamma_1 \approx -1$ of the small-$k$
power law exponent is much less common, observed before only in some scientific collaboration
networks and food webs. The possibility of a double power law was discussed in earlier work,
but the WIW network is the first data set which conclusively demonstrates the existence of
double power law behaviour. The two snapshots give essentially identical distributions. Since
the network grew by about 50% during this period, the described distribution can be regarded
as essentially stationary in time.
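The two exponents can be estimated from a list of node degrees along the following lines. This is only a minimal sketch: the function names are hypothetical, and a least-squares fit on the log-log histogram is a crude estimator chosen for brevity, not necessarily the procedure used to obtain the values quoted above. The lower cutoff `k_min=2` is an assumption that excludes degree-one nodes from the small-$k$ fit.

```python
import math
from collections import Counter

def degree_distribution(degrees):
    """Return {k: P(k)} for a list of node degrees."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in counts.items()}

def loglog_slope(pk, k_min, k_max):
    """Least-squares slope of log P(k) against log k for k_min <= k <= k_max."""
    pts = [(math.log(k), math.log(p))
           for k, p in pk.items() if k_min <= k <= k_max and p > 0]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx

def two_regime_exponents(pk, k_crit, k_min=2):
    """Return (gamma_1, gamma_2): slopes below and above the critical degree."""
    gamma_1 = loglog_slope(pk, k_min, k_crit)
    gamma_2 = loglog_slope(pk, k_crit, max(pk))
    return gamma_1, gamma_2
```

For real data the tail of the histogram is noisy, so logarithmic binning before the fit would likely be needed; that step is omitted here.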

The two scaling regimes in the degree distribution of the WIW graph are indicative of two
distinct growth processes: the invitation of new members, and the recording of acquaintance
between already registered members. The degree distribution of the invitation tree graph is
shown in Figure 2, where a power law is observed for large degrees with an exponent
$\gamma \approx -3$. Since this distribution is qualitatively different from the total degree
distribution, it is reasonable to conclude that there are at least two different types of
social linking at play here: the network of friends defined by ties strong enough to warrant
an invitation is different from the network of acquaintances that drives the mutual
recognition once both parties are registered.

FIG. 1: The degree distribution of the WIW network (diamonds), with a small-$k$ power law
$P(k) \sim k^{-1.0}$ and a large-$k$ power law $P(k) \sim k^{-2.0}$ separated by a critical
degree $k_{\rm crit} \approx 25$. The solid line gives the degree distribution of the model of
the text with edge/node ratio $m = 15$, $q = 0.5$ and size $V = 2 \times 10^5$, averaged over
50 graphs.

FIG. 2: The degree distribution of the invitation tree of the WIW graph (diamonds) exhibits a
large-$k$ power law $P(k) \sim k^{-3}$. Also plotted is the invitation tree of the model graph
with $V = 2 \times 10^5$ nodes, $m = 15$ and different parameters $q$.

FIG. 3: The correlation between the local clustering coefficient $C(k)$ and the node degree
$k$ for the WIW graph (diamonds), showing a power law $C(k) \sim k^{-0.33}$. The solid line
plots the same for the model graph with parameters $V = 2 \times 10^5$, $m = 15$ and critical
$q = 0.5$, averaged over 50 graphs.

The density of edges in the neighbourhood of a node is measured by the local clustering
coefficient. For a node $v$ of degree $k$, the local clustering coefficient $C(v)$ is the
number of acquaintance triangles of which $v$ is a vertex, divided by $k(k-1)/2$, the number
of all possible triangles. Figure 3 plots $C(k)$, the average of $C(v)$ over nodes of degree
$k$, against the degree, showing the existence of a power law

$$C(k) \sim k^{-0.33}.$$
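The definitions of $C(v)$ and $C(k)$ above translate directly into code. The following is a minimal sketch, assuming the graph is stored as a dictionary `adj` mapping each node to the set of its neighbours (a representation chosen here for illustration). Setting $C(v) = 0$ for nodes of degree below 2, where the denominator vanishes, is likewise an assumed convention, and such nodes are left out of $C(k)$.

```python
from collections import defaultdict
from itertools import combinations

def local_clustering(adj, v):
    """C(v): triangles through v divided by k(k-1)/2 (taken as 0 when k < 2)."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # A triangle through v corresponds to a connected pair of neighbours of v.
    triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return triangles / (k * (k - 1) / 2)

def clustering_by_degree(adj):
    """C(k): average of C(v) over all nodes v of degree k, for k >= 2."""
    buckets = defaultdict(list)
    for v, nbrs in adj.items():
        if len(nbrs) >= 2:
            buckets[len(nbrs)].append(local_clustering(adj, v))
    return {k: sum(cs) / len(cs) for k, cs in buckets.items()}
```

For a triangle with one pendant node attached, `adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}`, node 0 has one connected neighbour pair out of three, so `local_clustering(adj, 0)` is 1/3, while nodes 1 and 2 have clustering 1.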

A relationship $C(k) \sim k^{-\alpha}$ was observed before, but with significantly larger
exponents. Such power laws hint at the presence of hierarchical architecture: when small
groups organize into increasingly larger groups in a hierarchical manner, the local clustering
decreases on different scales according to such a power law.

The average clustering coefficient $\langle C \rangle \approx 0.2$ is obtained as the average
of $C(v)$ over all nodes. The average distance between two members of the WIW network is about
4.5. These two measures indicate the "small-world" nature of the WIW network: high clustering
combined with a small average path length.

- Time development. As Figure 4 shows, the number of nodes of the WIW network grew
approximately linearly with time. This appears to be related to the fact that the WIW network
develops as a subgraph of the underlying social network, and thus the availability of new
members is constrained by high clustering of the existing social links: a substantial
proportion of the acquaintances of a newly invited member will have been invited already.

On the other hand, the number of edges also grew linearly with time, and thus the edge/node
ratio only grew moderately during the existence of the network. This observation is in
contradiction with any purely local time-independent edge creation mechanism. If every member
of the network actively participated in edge creation independently of its age in the network,
the edge/node ratio would also increase linearly with time. This linear growth of the
edge/node ratio was not observed in the network, and hence we conclude that the edge creation
activity of members necessarily decreased with time.
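As a quick arithmetic check, the edge/node ratio can be computed from the two snapshot sizes quoted earlier. The short sketch below uses only those four numbers; the variable and function names are illustrative.

```python
# Snapshot sizes quoted in the text: (nodes, edges).
snapshots = {
    "October 2002": (12388, 74495),
    "January 2003": (17496, 127190),
}

def edge_node_ratio(nodes, edges):
    """Edges per node; the mean degree is twice this value."""
    return edges / nodes

ratios = {label: edge_node_ratio(*size) for label, size in snapshots.items()}
for label, r in ratios.items():
    print(f"{label}: E/V = {r:.2f}, mean degree = {2 * r:.2f}")

(v0, e0), (v1, e1) = snapshots["October 2002"], snapshots["January 2003"]
node_growth = v1 / v0 - 1    # relative growth in the number of nodes
edge_growth = e1 / e0 - 1    # relative growth in the number of edges
ratio_growth = ratios["January 2003"] / ratios["October 2002"] - 1
print(f"nodes +{node_growth:.1%}, edges +{edge_growth:.1%}, E/V +{ratio_growth:.1%}")
```

The ratio comes out at about 6.0 for the first snapshot and about 7.3 for the second, an increase of roughly a fifth between the two snapshots.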

FIG. 4: The time development of the number of nodes and edges of the WIW network. Note that
the number of nodes is multiplied by 3 for better visibility.

The fact that the edge/node ratio changes little over time is consistent with the observed
stationary nature of the degree distribution. To see this, consider a growing network with
$V(t)$ nodes at time $t$ and a time-independent degree distribution $P(k)$ with
$\sum_k P(k) = 1$ and finite first moment $\sum_k k P(k)$. At time $t$, there are

$$n(k,t) = V(t) P(k)$$

nodes of degree $k$, and hence the number of edges is

$$E(t) = \frac{1}{2} \sum_k k \, n(k,t) = \frac{V(t)}{2} \sum_k k P(k).$$

Consequently the edge/node ratio $E(t)/V(t)$ is essentially constant, and it only changes
because the maximal degree increases. This argument applies to the WIW network with stationary
distribution

$$P(k) \sim \begin{cases} k^{\gamma_1} & \text{if } k < k_{\rm crit}, \\ k^{\gamma_2} & \text{if } k > k_{\rm crit}, \end{cases}$$

with $\gamma_1 \approx -1$, $\gamma_2 \approx -2$. This distribution is on the boundary of
distributions with finite first moment: the first moment exists for $\gamma_2 < -2$ but not
otherwise.
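The boundary statement can be made explicit by approximating the large-degree part of the first moment by an integral up to the maximal degree $k_{\max}$. This is a sketch under the assumption that the tail follows the pure power law $k^{\gamma_2}$ all the way to $k_{\max}$; the symbol $k_{\max}$ is introduced here for illustration.

```latex
\sum_{k = k_{\rm crit}}^{k_{\max}} k \, P(k)
  \;\sim\; \int_{k_{\rm crit}}^{k_{\max}} k^{\gamma_2 + 1} \, dk
  = \begin{cases}
      \dfrac{k_{\max}^{\gamma_2 + 2} - k_{\rm crit}^{\gamma_2 + 2}}{\gamma_2 + 2},
        & \gamma_2 \neq -2, \\[2ex]
      \ln\!\left( k_{\max} / k_{\rm crit} \right),
        & \gamma_2 = -2.
    \end{cases}
```

For $\gamma_2 < -2$ the integral converges as $k_{\max} \to \infty$, whereas at $\gamma_2 = -2$ it grows only logarithmically with $k_{\max}$. Under this approximation a slowly increasing maximal degree would produce a slow upward drift of $E(t)/V(t)$ rather than a strictly constant ratio.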

- Network modelling. A random graph process based on linear preferential attachment for the
creation of new edges was proposed earlier to account for the observed power laws in natural
networks. Such a process indeed leads to a graph with a power law degree distribution. However,
this model is by definition macroscopic, requiring information about the entire network in
every step. This assumption is realistic for the World Wide Web or some collaboration
networks, where all nodes are "visible" from all others. For human social networks, however,
it is reasonable to assume some degree of locality in the interactions. Also, the original
scale-free models are not applicable to networks with high, degree-dependent clustering
coefficients. These problems motivated the introduction of new models which use local triangle
creation mechanisms, which increase clustering in the network. These models have
degree-dependent local clustering, and can also lead to power law degree distributions, though
no existing model of this kind shows a double power law.



We now present a new model to account for the observed properties of the WIW network. As
mentioned above, the WIW can be viewed as a growing subgraph of the underlying social
acquaintance graph. This suggests a model obtained by a two-step process, first modelling the
underlying graph, and then implementing a growth process. The lack of available data on the
underlying graph, however, prevents us from following this programme directly. We build instead
a growing graph in one single process, choosing the local triangle mechanism as our basic edge
creation method. This models the social introduction of two members of the WIW network by a
common friend some time in the past, such edges being gradually recorded in the WIW network
itself. The invitation of new members is modelled by sublinear preferential attachment,
motivated by experimental results on scientific collaboration networks, where the data permits
the analysis of initial edge formation. We also impose a constant edge/node ratio to be
consistent with the observed stationary distribution. Note that this constant has to be tuned
from the shape of observed distributions, and cannot be inferred from the WIW data directly.
The reason for this is that the WIW has a disproportionate number of nodes of degree one
(Figure 1), representing people who once responded to the invitation but never returned, which
distorts the edge/node ratio without invalidating our other conclusions.

The precise description of our process is as follows. 

• We begin with a small regular graph.

• New nodes arrive at a rate of one per unit time and are attached to an earlier node chosen
with a probability distribution giving weight $k^q$ to a node of degree $k$, where $q \ge 0$
is a parameter.

• Internal edges are created as follows: we select two random neighbours of a randomly chosen
node $v$, and if they are unconnected, we create an edge between them. Otherwise, we select
two new neighbours of the same node $v$ and try again.

• A constant number of internal edges is created per unit time, so that the edge/node ratio
equals a constant $m$ after each time step.
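The four steps above can be sketched in code as follows. Several details are assumptions made here for illustration, since the description leaves them open: the seed is taken to be a ring of `4 * m` nodes in which each node is linked to its `m` nearest neighbours on either side (a $2m$-regular graph with edge/node ratio exactly $m$); each time step adds one invitation edge and `m - 1` internal edges; and after `max_tries` failed neighbour pairs a fresh node $v$ is drawn, which avoids looping forever on a node whose neighbours are already all connected. The function name and signature are hypothetical.

```python
import random

def grow_network(n_nodes, m=15, q=0.5, seed=None, max_tries=50):
    """Grow a graph by invitations (weight k**q) and triangle-closing edges.

    Returns (adj, inviter): adj maps node -> set of neighbours, inviter maps
    each invited node -> the node that invited it (the invitation tree).
    """
    rng = random.Random(seed)
    n0 = 4 * m  # assumed seed size; the text only says "small regular graph"
    adj = {i: set() for i in range(n0)}
    for i in range(n0):
        for d in range(1, m + 1):  # link to the m nearest nodes on each side
            j = (i + d) % n0
            adj[i].add(j)
            adj[j].add(i)
    inviter = {}
    for new in range(n0, n_nodes):
        # Invitation: attach the new node to an earlier node with weight k**q.
        weights = [len(adj[u]) ** q for u in range(new)]
        host = rng.choices(range(new), weights=weights, k=1)[0]
        adj[new] = {host}
        adj[host].add(new)
        inviter[new] = host
        # Internal edges: m - 1 per step keeps the edge/node ratio equal to m.
        created = 0
        while created < m - 1:
            v = rng.randrange(new + 1)  # uniformly chosen node
            if len(adj[v]) < 2:
                continue
            nbrs = list(adj[v])
            for _ in range(max_tries):
                a, b = rng.sample(nbrs, 2)
                if b not in adj[a]:  # unconnected pair: close the triangle
                    adj[a].add(b)
                    adj[b].add(a)
                    created += 1
                    break
    return adj, inviter
```

A call such as `adj, inviter = grow_network(2000, m=15, q=0.5, seed=1)` yields a graph whose degree sequence is `[len(s) for s in adj.values()]`. Recomputing the weights at every step costs time proportional to the current size, so reaching the sizes used in the figures would require a more efficient sampling scheme than this sketch provides.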

The degree distribution of graphs generated by our process is shown in Figure 5. We found a
very robust large-$k$ power law of exponent $\gamma_2 \approx -2$, essentially independently
of the invitation mechanism. We measured the joint probability distribution of the degrees
$k, k'$ of nodes connected by new internal edges, and found that for large values it was
proportional to $kk'$. This directly leads to a power law exponent of $-2$ via standard
mean-field arguments. The small-$k$ behaviour was found to be sensitive to the invitation
mechanism; Figure 5 shows that a second power law only appears at a critical $q$. The critical
value of $q$ depends on the edge/node ratio.
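One way to fill in the mean-field step is sketched below; it is a reconstruction under stated assumptions rather than the derivation behind the quoted result. Assume that each end of a new internal edge lands on node $i$ with probability $k_i / \sum_j k_j$, as suggested by the measured $kk'$ dependence, that $m - 1$ internal edges are created per unit time, that $V(t) \approx t$ so that $\sum_j k_j = 2mt$, and that the contribution of invitation edges to large degrees is negligible. Here $t_i$ denotes the arrival time of node $i$, and $\beta$ is a symbol introduced for this sketch.

```latex
\begin{aligned}
\frac{dk_i}{dt} &= 2(m-1) \, \frac{k_i}{\sum_j k_j}
                 = \frac{m-1}{m} \, \frac{k_i}{t}
\quad \Longrightarrow \quad
k_i(t) = k_i(t_i) \left( \frac{t}{t_i} \right)^{\beta},
\qquad \beta = \frac{m-1}{m}, \\[1ex]
P(k) &\sim k^{-(1 + 1/\beta)} = k^{-(2m-1)/(m-1)}
\;\longrightarrow\; k^{-2} \qquad (m \gg 1).
\end{aligned}
```

The second line uses uniformly distributed arrival times $t_i$. For $m = 15$ this estimate gives an exponent of $-29/14 \approx -2.07$, close to the value $-2$ quoted above.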
FIG. 5: The dependence of the degree distribution of our model graph on the parameter $q$,
with $m = 15$ and $V = 2 \times 10^5$ ($q = 0.5, 1$) and $V = 5 \times 10^4$ ($q = 0$),
averaged over 50 graphs.

To test the hypothesis that the low degree power law is indeed related to the invitation
mechanism, we plot in Figure 2 the degree distribution of the invitation tree of the model
network for various values of the parameter $q$. At the critical value, we obtain a scale-free
distribution with exponent $-3$. Decreasing $q$ leads to a much sharper drop in the curve, with
an exponential tail for $q = 0$, whereas increasing $q$ above the critical value leads to a
gelation-type behaviour: new nodes connect only to very large degree nodes.

Figure 1 shows that, for an appropriate choice of parameters, the degree distribution of our
model reproduces that of the real network extremely well. Figure 3 plots the dependence of the
local clustering coefficient $C(k)$ on the degree $k$ in our model network. While no simple
power law can be observed, there is a clear trend of decreasing clustering with increasing
degree, indicative of the presence of hierarchy in the model network [23].

- Conclusion. We have presented and analyzed a large new data set of a human acquaintance
network with a stable degree distribution which exhibits a new feature: two power law regimes
with different exponents. The observed approximately constant edge/node ratio is a consequence
of the stability of the degree distribution, and implies that the average activity of members
is time dependent, whereas the growth of the number of nodes is constrained by the underlying
social network. We also introduced a model which reproduces the observed degree distribution
extremely well, and concluded that the small-$k$ power law is related to the scale-free nature
of the invitation tree, whereas the large-$k$ power law is a result of the triangle mechanism
of social introductions. Our results show that human social networks are likely to be composed
of several networks with different characteristics, and directly observable processes will
exhibit a mixture of features resulting from distinct underlying mechanisms.